A Taste of Topological Data Analysis

Shape Reconstruction from Noisy Data

Sushovan Majhi
George Washington University, Washington DC, USA

Data Science vs Math

Today’s Agenda

  • Getting to know each other
  • Topological Data Analysis (TDA)
    • Why and How
  • The problem of shape reconstruction
  • Shape reconstruction techniques
  • Opportunities
  • Questions

About Me

Tulane University, New Orleans

PhD in Mathematics

UC Berkeley, California

Data Science Postdoc Researcher

George Washington University, DC

Assistant Professor of Data Science

Get in Touch

What is Data Science?

  • What is Data Science?

    It’s the art of learning meaningful patterns from data.

  • Who is a Data Scientist?

    You.

  • What is Topological Data Analysis (TDA)?

    TDA is a subfield of data science that focuses on the systematic learning of geometric and topological patterns of data.

Topological Data Analysis

Shape of Data

A sample

Learn Geometric Patterns (Features)

A Reconstruction

  • The topology of data is the wedge of two circles (two holes)
  • The geometry of the data is graph-like

Success of TDA

  1. The data inherit an intrinsic topological structure
  2. Topology-inspired methods are robust to outliers
  3. The curse of dimensionality is easier to mitigate
  4. Geometry-aware deep learning methods facilitate interpretation
    • geometricdeeplearning.com
  • Software: giotto-tda, GUDHI

Particular Application Areas

  • Shape Reconstruction
    • Manifold: Majhi (2023a)
    • Graph: Majhi (2023b)
    • More general spaces: Komendarczyk, Majhi, and Tran (2024)
  • Detection of Financial Market Crash
  • Predicting Indian Monsoon
    • Ongoing collaboration
  • Deep Learning Pipelines
    • topological deep learning

The Problem of Shape Reconstruction

The Good Shapes

Circle

Donut

The Bad Shapes

VIT Chennai Campus

A Real Application

The City of Berlin

Q: How can we draw a map of the city from a noisy point cloud of GPS locations?

A Sample Output

Source: mapconstruction.org

The Mathematical Formulation

  • Shape: A shape is modeled as a metric space \((M,d_M)\).

    • Examples: a general compact set, a metric graph, or a Riemannian manifold.
  • Sample: A finite metric space \((X,d_X)\) close to \(M\).

    • Closeness means a small Hausdorff distance if \(M\) is a Euclidean submanifold and \(X\subset\mathbb R^d\); alternatively, a small Gromov–Hausdorff distance.
  • Goal: Infer the topology of \(M\) from \(X\).

    • Estimate only the Betti numbers of \(M\): the number of connected components, cycles, voids, etc.

    • Construct a topological space \(\widetilde{M}\) from \(X\) that retains the topology of \(M\), i.e., \(M\simeq\widetilde{M}\).
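
As a rough illustration of the "Sample" condition, the sketch below computes the Hausdorff distance between two finite planar point sets. It makes several assumptions that are not part of the talk: the shape \(M\) is the unit circle, stood in for by a dense discretization, and the sample \(X\) is produced by a hypothetical `noisy_sample` helper with bounded uniform noise.

```python
import math
import random


def directed_hausdorff(A, B):
    """Largest distance from a point of A to its nearest point in B."""
    return max(min(math.dist(a, b) for b in B) for a in A)


def hausdorff(A, B):
    """Symmetric Hausdorff distance between two finite point sets."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))


def circle_points(n):
    """n evenly spaced points on the unit circle (a stand-in for the shape M)."""
    return [(math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
            for i in range(n)]


def noisy_sample(n, noise, seed=0):
    """n circle points, each coordinate perturbed by at most `noise` (illustrative helper)."""
    rng = random.Random(seed)
    return [(x + rng.uniform(-noise, noise), y + rng.uniform(-noise, noise))
            for x, y in circle_points(n)]


M = circle_points(400)             # dense discretization of the shape
X = noisy_sample(40, noise=0.05)   # small noisy sample
print(round(hausdorff(M, X), 3))   # small value: X is Hausdorff-close to M
```

The brute-force double loop is quadratic in the number of points, which is fine for a toy example but not for large point clouds.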

Vietoris–Rips Complex

  • a metric space \((X,d_X)\)

  • a scale \(\beta>0\)

  • \(R_\beta(X)\) is the abstract simplicial complex such that

    • its \((k-1)\)-simplices are exactly the subsets \(A\subset X\) of size \(k\) with diameter at most \(\beta\).
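
The definition above can be sketched in a few lines of plain Python. This is a minimal brute-force illustration, not an optimized implementation: it enumerates simplices up to dimension 2 and computes the Betti numbers \(b_0, b_1\) from boundary-matrix ranks over \(\mathbb{Z}/2\) (which can differ from rational Betti numbers in general, though not in this example). The function names and the figure-eight sample, which echoes the wedge of two circles mentioned earlier, are assumptions made for the example.

```python
import math
from itertools import combinations


def rips_complex(points, beta, max_dim=2):
    """Simplices of R_beta(points) up to dimension max_dim, as sorted index tuples."""
    n = len(points)
    close = [[math.dist(points[i], points[j]) <= beta for j in range(n)]
             for i in range(n)]
    simplices = {0: [(i,) for i in range(n)]}
    for k in range(1, max_dim + 1):
        # A subset has diameter <= beta iff all of its pairwise distances are <= beta.
        simplices[k] = [s for s in combinations(range(n), k + 1)
                        if all(close[i][j] for i, j in combinations(s, 2))]
    return simplices


def rank_gf2(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bitmasks."""
    pivots = {}  # highest set bit -> reduced row with that leading bit
    for row in rows:
        while row:
            top = row.bit_length() - 1
            if top not in pivots:
                pivots[top] = row
                break
            row ^= pivots[top]
    return len(pivots)


def boundary_rank(simplices, k):
    """Rank of the boundary map from k-simplices to (k-1)-simplices over GF(2)."""
    if k == 0 or not simplices.get(k):
        return 0
    index = {face: i for i, face in enumerate(simplices[k - 1])}
    rows = []
    for s in simplices[k]:
        mask = 0
        for face in combinations(s, k):  # the k+1 codimension-one faces of s
            mask |= 1 << index[face]
        rows.append(mask)
    return rank_gf2(rows)


def betti_numbers(simplices, max_k=1):
    """b_k = #k-simplices - rank(boundary_k) - rank(boundary_{k+1}), for k <= max_k."""
    return [len(simplices[k]) - boundary_rank(simplices, k)
            - boundary_rank(simplices, k + 1) for k in range(max_k + 1)]


# Figure-eight sample: two unit circles touching at the origin (23 points in total).
angles = [2 * math.pi * i / 12 for i in range(12)]
left = [(-1 + math.cos(t), math.sin(t)) for t in angles]
right = [(1 - math.cos(t), math.sin(t)) for t in angles[1:]]  # origin already in `left`
figure_eight = left + right

print(betti_numbers(rips_complex(figure_eight, beta=0.6)))  # prints [1, 2]
```

At this scale the complex has one connected component and two independent cycles. If the scale is much smaller the sample falls apart into isolated points, and if it is much larger the cycles get filled in, so the choice of \(\beta\) matters.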

Limitations of TDA

  • TDA is theory-heavy, with a steep learning curve
  • The community is strong but not yet large
  • TDA is not applicable to data without any intrinsic geometry
  • We need more open-source software implementations of TDA tools

Opportunities

  • TDA

Questions

Thank you

References

Komendarczyk, Rafal, Sushovan Majhi, and Will Tran. 2024. “Topological Stability and Latschev-Type Reconstruction Theorems for \(\mathrm{CAT}(\kappa)\) Spaces.” arXiv. https://doi.org/10.48550/ARXIV.2406.04259.
Majhi, Sushovan. 2023a. “Demystifying Latschev’s Theorem: Manifold Reconstruction from Noisy Data.” arXiv:2204.14234 [Math.AT]. https://doi.org/10.48550/ARXIV.2305.17288.
———. 2023b. “Vietoris–Rips Complexes of Metric Spaces Near a Metric Graph.” Journal of Applied and Computational Topology, May. https://doi.org/10.1007/s41468-023-00122-z.
Rai, Anish, Buddha Nath Sharma, Salam Rabindrajit Luwang, Md Nurujjaman, and Sushovan Majhi. 2024. “Identifying Extreme Events in the Stock Market: A Topological Data Analysis.” Chaos 34 (10).